Assessing the Quality of Natural Language Text Data

Author

  • Daniel Sonntag
Abstract

We follow an empirical approach from data quality toward text quality, in which the expectations of the consumer, whether human or machine, take centre stage. We try to obtain numerical text quality statements, which must be interpreted separately with respect to the expectations of the user and the suitability for automatic natural language processing (NLP). We state that, apart from text accessibility, only representational text quality metrics can today be derived and computed automatically. Interestingly, text quality for NLP traces back to questions of text representation.

Similar resources

Assessing the Quality of Persian Translation of Orwell’s Nineteen Eighty-Four Based on House’s Model: Overt-Covert Translation Distinction

This study aimed to assess the quality of Persian translation of Orwell's (1949) Nineteen Eighty-Four by Balooch (2004) based on House's (1997) model of translation quality assessment. To do so, 23 pages (about 10 percent) of the source text were randomly selected. The profile of the source text register was produced and the genre was realized. The source text profile was compared to t...

Assessing the Quality of Persian Translation of the Book “Principles of Marketing” Based on the House’s (TQA) Model

Translation is evaluated in terms of its forms and functions inside the historically developed systems of the receiving culture and literature. This study aimed to evaluate the quality of the Persian translation of the 14th edition of the original English book “Principles of Marketing”, written by Philip Kotler and Gary Armstrong, based on House's (TQA) model: overt and covert translation distinction. T...

Assessing the Quality of Unstructured Data: An Initial Overview

In contrast to structured data, unstructured data such as texts, speech, videos and pictures do not come with a data model that enables a computer to use them directly. Nowadays, computers can interpret the knowledge encoded in unstructured data using methods from text analytics, image recognition and speech recognition. Therefore, unstructured data are used increasingly in decision-making proc...

Assessing the Stylistic Properties of Neurally Generated Text in Authorship Attribution

Recent applications of neural language models have led to an increased interest in the automatic generation of natural language. However impressive, the evaluation of neurally generated text has so far remained rather informal and anecdotal. Here, we present an attempt at the systematic assessment of one aspect of the quality of neurally generated text. We focus on a specific aspect of neural l...

Publication date: 2004